fix: Further improve performance of the UTF-8 string comparison logic #2182
Merged
The semantics of this logic were originally fixed by #1967, but that fix caused a material performance degradation, which was then improved by #2021. The performance was, however, still suboptimal. This PR brings the speed back close to its original level and, serendipitously, simplifies the algorithm too.

This commit effectively ports the following two PRs from the firebase-android-sdk repository:
- firebase/firebase-android-sdk#7098
- firebase/firebase-android-sdk#7109
wu-hui approved these changes on Jul 7, 2025
This was referenced Dec 15, 2025
This PR both improves performance and simplifies the UTF-8 string comparison logic. It addresses prior performance regressions by introducing a more optimized algorithm for string ordering.
The semantics of the UTF-8 string comparison logic were originally fixed by #1967, but that fix caused a material performance degradation, which was then improved by #2021. The performance was, however, still suboptimal. This PR brings the speed back close to its original level and, serendipitously, simplifies the algorithm too.
This PR effectively ports the following two PRs from the firebase-android-sdk repository:
- firebase/firebase-android-sdk#7098
- firebase/firebase-android-sdk#7109
Highlights
- The compareUtf8Strings() method in Order.java has been rewritten to improve performance and simplify its logic. The new algorithm leverages the relationship between UTF-8 and UTF-16 representations for more efficient string comparison, avoiding costly byte string conversions.
- The use of codePointAt and ByteString.copyFromUtf8 for non-ASCII characters has been replaced with a more straightforward character-by-character comparison that handles surrogate pairs explicitly.
- OrderTest.java has been updated to improve the robustness of the compareUtf8Strings() test. Instead of asserting exact return values, it now checks the signum (sign) of the comparison result, which is a more appropriate way to validate comparator behavior.
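
The idea in the highlights above can be illustrated with a minimal sketch. This is not the code from Order.java. The class name Utf8OrderSketch and the method name compareUtf8Order are hypothetical, and the sketch assumes well-formed UTF-16 input (no unpaired surrogates). It relies on two facts: UTF-8 byte order matches Unicode code point order, and UTF-16 code unit order differs from code point order only when a surrogate (U+D800 to U+DFFF) is compared against a non-surrogate. Surrogates encode code points of U+10000 and above, so they must sort after every non-surrogate.

```java
// Hypothetical sketch, not the implementation merged in this PR.
final class Utf8OrderSketch {
    private Utf8OrderSketch() {}

    // Orders two strings as their UTF-8 encodings would sort,
    // without encoding either string to bytes.
    // Assumes well-formed UTF-16 (every surrogate is part of a valid pair).
    static int compareUtf8Order(String left, String right) {
        int length = Math.min(left.length(), right.length());
        for (int i = 0; i < length; i++) {
            char l = left.charAt(i);
            char r = right.charAt(i);
            if (l == r) {
                continue;
            }
            boolean leftIsSurrogate = Character.isSurrogate(l);
            boolean rightIsSurrogate = Character.isSurrogate(r);
            if (leftIsSurrogate == rightIsSurrogate) {
                // Both surrogates or both non-surrogates: code unit
                // order agrees with code point order.
                return Character.compare(l, r);
            }
            // Exactly one side is a surrogate, i.e. part of a code point
            // >= U+10000, which sorts after any non-surrogate character.
            return leftIsSurrogate ? 1 : -1;
        }
        // One string is a prefix of the other; the shorter sorts first.
        return Integer.compare(left.length(), right.length());
    }

    public static void main(String[] args) {
        String bmpMax = "\uFFFF";       // U+FFFF, one UTF-16 code unit
        String emoji = "\uD83D\uDE00";  // U+1F600, a surrogate pair
        // Only the sign of a comparator result is meaningful, so check signum.
        System.out.println(Integer.signum(compareUtf8Order(bmpMax, emoji)));
        // String.compareTo uses UTF-16 code unit order and disagrees here.
        System.out.println(Integer.signum(bmpMax.compareTo(emoji)));
    }
}
```

Character.compare returns a difference rather than exactly -1, 0, or 1, which is why a test of such a comparator is better written against Integer.signum of the result than against exact values.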